KDTA: Automated Knowledge-Driven Text Annotation

Authors

Katerina Papantoniou

George Tsatsaronis

Georgios Paliouras

Abstract

In this paper we demonstrate a system that automatically annotates text documents with the concepts of a given domain ontology. The annotation process uses lexical and Web resources to analyze the semantic similarity of text components to the ontology concepts, and outputs a list of proposed annotations, accompanied by appropriate confidence values. The demonstrated system is available online and free to use, and it constitutes one of the main components of the KDTA (Knowledge-Driven Text Analysis) module of the CASAM European research project.

Similar resources

Integrating a Verb Lexicon into a Syntactic Treebank Production

The creation of linguistically interpreted corpora is a tedious task and the automation of the annotation process is indispensable. A fully automated annotation is hardly possible to achieve, since it requires very sophisticated and large knowledge bases which are themselves difficult to create. However, a "machine-aided approach" to the annotating process, using as many as available sources o...

Language Technology Support for Semantic Annotation of Iconographic Descriptions

The paper describes an approach for semantic annotation of multimedia objects implemented for the purposes of SINUS Project. Semantic annotations are supported by semantic annotation models based on ontological presentation of knowledge concerning Bulgarian Iconography. The process of semantic annotation includes automated data-lifting procedure and user-directed approach. The paper pays attent...

Distributional Framework for Emergent Knowledge Acquisition and its Application to Automated Document Annotation

The paper introduces a framework for representation and acquisition of knowledge emerging from large samples of textual data. We utilise a tensor-based, distributional representation of simple statements extracted from text, and show how one can use the representation to infer emergent knowledge patterns from the textual data in an unsupervised manner. Examples of the patterns we investigate ...

An approach to describing and analysing bulk biological annotation quality: a case study using UniProtKB

MOTIVATION: Annotations are a key feature of many biological databases, used to convey our knowledge of a sequence to the reader. Ideally, annotations are curated manually; however, manual curation is costly, time consuming and requires expert knowledge and training. Given these issues and the exponential increase of data, many databases implement automated annotation pipelines in an attempt to a...

Collaborative text-annotation resource for disease-centered relation extraction from biomedical text

Agglomerating results from studies of individual biological components has shown the potential to produce biomedical discovery and the promise of therapeutic development. Such knowledge integration could be tremendously facilitated by automated text mining for relation extraction in the biomedical literature. Relation extraction systems cannot be developed without substantial datasets annotated...

Publication date: 2010
